IIT Bombay's English-Indonesian submission at WAT: Integrating Neural Language Models with SMT

Authors

  • Sandhya Singh
  • Anoop Kunchukuttan
  • Pushpak Bhattacharyya
Abstract

This paper describes IIT Bombay's submission to the WAT 2016 shared task for the English-Indonesian language pair. Results are reported for both directions of the language pair. Among the approaches tried, the Operation Sequence Model (OSM) and Neural Language Model systems were submitted to WAT. The OSM approach integrates the translation and reordering processes, resulting in relatively improved translation. Similarly, the neural experiments integrate a neural language model into Statistical Machine Translation (SMT) as a feature for translation. The Neural Probabilistic Language Model (NPLM) gave relatively high BLEU scores for the Indonesian-to-English translation system, while the Neural Network Joint Model (NNJM) performed better in the English-to-Indonesian direction. The results indicate an improvement over the baseline phrase-based SMT of 0.61 BLEU points for the English-Indonesian system and 0.55 BLEU points for the Indonesian-English system.

Similar resources

Comparing Recurrent and Convolutional Architectures for English-Hindi Neural Machine Translation

In this paper, we empirically compare two encoder-decoder neural machine translation architectures, the convolutional sequence-to-sequence model (ConvS2S) and the recurrent sequence-to-sequence model (RNNS2S), for the English-Hindi language pair as part of IIT Bombay's submission to the WAT 2017 shared task. We report results for both the English-Hindi and Hindi-English directions of the language pair.

Japanese to English Machine Translation using Preordering and Compositional Distributed Semantics

The pipeline of modern statistical machine translation (SMT) systems consists of several stages, presenting interesting opportunities to tune it towards improved performance on distant language pairs like Japanese and English. We explore modifications to several parts of this pipeline. We include a preordering method in the preprocessing stage, a neural network based model in the tuning stage a...

Towards an Indonesian-English SMT System: A Case Study of an Under-Studied and Under-Resourced Language, Indonesian

This paper describes work on preparing an Indonesian-English Statistical Machine Translation (SMT) system. It includes the creation of an Indonesian morphological analyzer, MorphInd, and the compilation of an Indonesian-English parallel corpus, IDENTIC. We build an SMT system using the state-of-the-art phrase-based SMT system, MOSES. We show several scenarios where the morphological tool is used t...

The IIT Bombay SMT System for ICON 2014 Tools Contest

In this paper, we describe our submission to the ICON 2014 Tools Contest for Machine Translation. The source languages are English, Marathi, Tamil, Telugu, Bengali and the target language is Hindi. We submitted 15 systems; 5 each for the tourism, health and general domains. Our submission is a Phrase-based Statistical Machine Translation system with preprocessing and post-processing elements. A...

Forest-to-String SMT for Asian Language Translation: NAIST at WAT 2014

This paper describes the Nara Institute of Science and Technology’s (NAIST) submission to the 2014 Workshop on Asian Translation’s four translation tasks. All systems are based on forest-to-string (F2S) translation, in which the input sentence is first parsed using a syntactic parser, then a forest of possible syntactic analyses is translated into the target language. In addition to the baselin...


Publication date: 2016